Automatic Text Summarization Using a Machine Learning Approach
Authors
Abstract
In this paper we address the automatic summarization task. Recent research on extractive summary generation employs some heuristics, but few works indicate how to select the relevant features. We present a summarization procedure based on trainable Machine Learning algorithms that employs a set of features extracted directly from the original text. These features are of two kinds: statistical, based on the frequency of some elements in the text; and linguistic, extracted from a simplified argumentative structure of the text. We also present computational results obtained by applying our summarizer to some well-known text databases, and we compare these results to some baseline summarization procedures.
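As a rough illustration of the idea in the abstract, the sketch below treats extractive summarization as a trainable sentence-classification problem. It is not the authors' implementation. All function names are hypothetical, the two features (mean term frequency and sentence position) are assumed examples of the "statistical" kind, the linguistic features are omitted, and a small logistic regression stands in for whichever trainable algorithm the paper actually uses.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens only (an assumption made for simplicity)."""
    return re.findall(r"[a-z]+", sentence.lower())


def sentence_features(sentences):
    """One (mean relative term frequency, position score) tuple per sentence."""
    doc_counts = Counter(tok for s in sentences for tok in tokenize(s))
    total = sum(doc_counts.values()) or 1
    n = len(sentences)
    feats = []
    for i, s in enumerate(sentences):
        toks = tokenize(s)
        mean_tf = sum(doc_counts[t] for t in toks) / (len(toks) * total) if toks else 0.0
        position = 1.0 - i / n  # earlier sentences get a higher score
        feats.append((mean_tf, position))
    return feats


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=500, lr=0.5):
    """Stand-in trainable classifier: logistic regression fitted by SGD.

    labels[i] is 1 if sentence i belongs in the reference summary, else 0.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - pred
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
    return weights, bias


def summarize(text, weights, bias, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    feats = sentence_features(sentences)
    scores = [sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) for x in feats]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    # Toy training document with hand-made labels (purely illustrative data).
    doc = "Cats chase mice. Mice eat cheese. Cheese is made from milk. The weather was mild."
    sents = split_sentences(doc)
    w, b = train_logistic(sentence_features(sents), [1, 1, 0, 0])
    print(summarize(doc, w, b, k=2))
```

In practice the training labels would come from reference summaries, and the choice and selection of features is exactly the question the abstract raises.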
Similar resources
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
An Approach for Concept-based Automatic Multi-Document Summarization using Machine Learning
Text summarization compresses the source text into a shorter version while preserving its information content and overall meaning. It is very difficult for humans to manually summarize large text documents. Text summarization plays an important role in the area of natural language processing and text mining. Many approaches use statistics and machine learning techniques to extract sent...
LEXA: Towards Automatic Legal Citation Classification
In this paper we present our approach towards legal citation classification using incremental knowledge acquisition. This forms a part of our more ambitious goal of automatic legal text summarization. We created a large training and test corpus from court decision reports in Australia. We showed that, within less than a week, it is possible to develop a good quality knowledge base which conside...
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "tweets" via one of the machine l...
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognizing emotion in Persian text as a supervised machine learning problem. We used Plutchik's emotion model as the supervised learning criteria and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...